Multiple Imputation in Two Stages
Abstract
Conventional multiple imputation (MI) (Rubin, 1987) replaces the missing values in a dataset by m > 1 sets of simulated values. We describe a two-stage extension of MI in which the missing values are partitioned into two groups and imputed N = mn times in a nested fashion. Two-stage MI divides the missing information into two components of variability, lending insight when the missing values are of two qualitatively different types. It also opens new possibilities for making different assumptions about the mechanisms producing the two kinds of missing values. Point estimates and standard errors from the N complete-data analyses are consolidated by simple rules derived by analogy to nested analysis of variance. After reviewing the theory and practice of two-stage MI, we illustrate the method with a simple analysis of binary variables from a longitudinal survey.
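The consolidation step can be sketched in a few lines. The abstract does not spell out the rules, so the sketch below assumes the combining formula commonly given for nested MI, T = U_bar + (1 + 1/m) B + (1 - 1/n) W, where U_bar is the average complete-data variance, B the between-nest variance of the nest means, and W the pooled within-nest variance. The function name `combine_two_stage` and its input layout are hypothetical, chosen only for illustration.

```python
from statistics import mean


def combine_two_stage(q, u):
    """Combine results from m x n nested imputations (illustrative sketch).

    q[i][j]: point estimate from nest i (stage one), imputation j (stage two).
    u[i][j]: squared standard error of that estimate.
    Assumes a balanced design: every nest holds the same number n of imputations.
    """
    m, n = len(q), len(q[0])
    if m < 2 or n < 2:
        raise ValueError("need m >= 2 nests and n >= 2 imputations per nest")
    if any(len(row) != n for row in q) or any(len(row) != n for row in u):
        raise ValueError("every nest must contain exactly n imputations")

    nest_means = [mean(row) for row in q]
    q_bar = mean(nest_means)                 # overall point estimate
    u_bar = mean(mean(row) for row in u)     # average complete-data variance

    # Between-nest variance: spread of the nest means (m - 1 degrees of freedom).
    b = sum((qi - q_bar) ** 2 for qi in nest_means) / (m - 1)

    # Within-nest variance: pooled spread inside nests (m(n - 1) degrees of freedom).
    w = sum(
        (qij - qi) ** 2 for row, qi in zip(q, nest_means) for qij in row
    ) / (m * (n - 1))

    # Assumed nested-MI total variance, by analogy to nested ANOVA.
    t = u_bar + (1 + 1 / m) * b + (1 - 1 / n) * w
    return q_bar, t, b, w, u_bar
```

With m = 2 nests of n = 2 imputations each, for example, `combine_two_stage([[1.0, 3.0], [5.0, 7.0]], [[1.0, 1.0], [1.0, 1.0]])` returns the overall estimate together with the total variance and its three components; the square root of the total variance serves as the standard error.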
Similar resources
Accuracy evaluation of different statistical and geostatistical censored data imputation approaches (Case study: Sari Gunay gold deposit)
Most geochemical datasets include missing data in varying proportions, which may cause a significant problem in geostatistical modeling or multivariate analysis of the data. It is therefore common to impute the missing data in geochemical studies. In this study, three approaches called half detection (HD), multiple imputation (MI), and the cosimulation based on Markov model 2...
Several approaches to handling missing values of quantitative variables and their effect on the results of a clinical trial
Background and Objectives: A major challenge in longitudinal studies is missing data. Missing data may cause part of the information to be lost, which reduces the accuracy of the estimator and can make the results biased and inaccurate. Therefore, it is necessary to evaluate the missing-data mechanism in longitudinal research and to consider thi...
An Empirical Comparison of Performance of the Unified Approach to Linearization of Variance Estimation after Imputation with Some Other Methods
Imputation is one of the most common methods of reducing the effects of item nonresponse. Imputation produces a complete data set, to which naïve estimators can then be applied. Under most common imputation methods, the mean and total (imputation estimators) remain unbiased. However, their variances (imputation variances) are underestimated by naïve variance estimators. Sampling mechanism an...
Effect of Reference Population Size and Imputation Methods on the Accuracy of Imputation in Pure and Mixed Populations
Imputation, as a method of converting low-density chip genotypes to high-density ones, has been introduced to increase the accuracy of genomic selection in animals. In the current study, to investigate imputation accuracy, three populations, mixed (scenario 1), pure (scenario 2), and mixed + pure (scenario 3), were simulated using QMSim. Two imputation methods, Beagle and FImpute, were used fo...
Inferences for Two-Stage Multiple Imputation for Nonresponse
Multiple imputation is a common approach for handling missing data. It allows users to make valid inferences using standard complete-data methods with simple combining rules. A variation is to partition the missing data into two portions and conduct the imputation in two stages. We review two-stage multiple imputation and existing inferential methods and derive an alternative reference F-distr...
Estimation of genotype imputation accuracy using reference populations with varying degrees of relationship and marker density panel
Genotype imputation from low-density to high-density (SNP) chips is an important step before applying genomic selection, because denser chips can provide more reliable genomic predictions. In the current research, the accuracy of genotype imputation from low and moderate-density panels (5K and 50K) to high-density panels in the purebred and crossbred populations was assessed. The simulated popu...